Development and Application of a Genetic Algorithm for Variable Optimization and Predictive Modeling of Five-Year Mortality Using Questionnaire Data

Authors

  • Lucas J. Adams
  • Ghalib Bello
  • Gerard G. Dumancas
Abstract

The problem of selecting important variables from questionnaire data for predictive modeling of a specific outcome of interest has rarely been addressed in clinical settings. In this study, we implemented a genetic algorithm (GA) technique to select optimal variables from questionnaire data for predicting five-year mortality. We examined 123 questions (variables) answered by 5,444 individuals in the National Health and Nutrition Examination Survey. The GA iterations selected the top 24 variables, including questions related to stroke, emphysema, and general health problems requiring the use of special equipment, for use in predictive modeling by various parametric and nonparametric machine learning techniques. Using these top 24 variables, gradient boosting yielded the nominally highest performance (area under the curve [AUC] = 0.7654), although several other techniques had lower AUCs that did not differ significantly from it. This study shows how a GA, in conjunction with various machine learning techniques, can be used to examine questionnaire data to predict a binary outcome.
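The general idea of GA-based variable selection described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the data are synthetic, and the function names (`make_toy_data`, `fitness`, `run_ga`), the fitness definition (AUC of an unweighted sum of the selected yes/no answers, minus a small per-variable penalty), and all GA settings (population size, tournament selection, one-point crossover, mutation rate) are assumptions made for this example. The abstract does not state which operators or fitness function the study used.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.5
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_toy_data(n_rows=200, n_vars=12, informative=(0, 3, 7), seed=0):
    """Synthetic yes/no 'questionnaire' (assumption: only a few questions drive the outcome)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        row = [rng.randint(0, 1) for _ in range(n_vars)]
        risk = sum(row[j] for j in informative)
        # Outcome probability rises with the number of informative "yes" answers.
        y.append(1 if rng.random() < 0.1 + 0.25 * risk else 0)
        X.append(row)
    return X, y

def fitness(mask, X, y, penalty=0.01):
    """Hypothetical fitness: AUC of the summed selected answers, penalized per variable."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    scores = [sum(row[j] for j in selected) for row in X]
    return auc(scores, y) - penalty * len(selected)

def run_ga(X, y, pop_size=20, generations=25, mutation_rate=0.05, seed=1):
    """Evolve bit masks over the variables; returns (best_fitness, best_mask)."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]

    def tournament(scored):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    best = None
    for _ in range(generations):
        scored = [(fitness(m, X, y), m) for m in pop]
        gen_best = max(scored, key=lambda t: t[0])
        if best is None or gen_best[0] > best[0]:
            best = gen_best
        next_pop = [gen_best[1][:]]  # elitism: carry the generation's best mask forward
        while len(next_pop) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            cut = rng.randrange(1, n_vars)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return best

if __name__ == "__main__":
    X, y = make_toy_data()
    score, mask = run_ga(X, y)
    print("selected variables:", [j for j, bit in enumerate(mask) if bit])
    print("penalized fitness:", round(score, 4))
```

In a setting like the one the abstract describes, the selected variables would then be passed to separate predictive models (the study reports gradient boosting among others) and evaluated on held-out data. The sketch stops at the selection step and scores masks on the same toy data it searches over, so its fitness values should not be read as performance estimates.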

Similar resources

Optimization of Plastic Injection Molding Process by Combination of Artificial Neural Network and Genetic Algorithm

Injection molding is one of the most important and common plastic forming methods. A combination of modeling tools and optimization algorithms can be used to determine the optimum process conditions for injection molding a particular part. Because of the complexity of the injection molding process and the multiplicity of parameters and their interactive effects on one another, analytical...

Application of genetic algorithm (GA) to select input variables in support vector machine (SVM) for analyzing the occurrence of roach, Rutilus rutilus, in streams

Support vector machine (SVM) was used to analyze the occurrence of roach in Flemish stream basins (Belgium). Several habitat and physico-chemical variables were used as inputs for the model development. The biotic variable merely consisted of abundance data which was used for predicting presence/absence of roach. Genetic algorithm (GA) was combined with SVM in order to select the most important...

Application of Genetic Algorithm to Determine Kinetic Parameters of Free Radical Polymerization of Vinyl Acetate by Multi-objective Optimization Technique

A multi-objective optimization procedure based on a genetic algorithm has been developed to determine some kinetic parameters of the free radical polymerization of vinyl acetate. For this purpose, mathematical modeling of the free radical polymerization of vinyl acetate is carried out first, and then selected kinetic parameters are optimized by minimizing objective functions defined from comparing exp...

AERO-THERMODYNAMIC OPTIMIZATION OF TURBOPROP ENGINES USING MULTI-OBJECTIVE GENETIC ALGORITHMS

In this paper multi-objective genetic algorithms were employed for Pareto approach optimization of turboprop engines. The considered objective functions are used to maximize the specific thrust, propulsive efficiency, thermal efficiency, propeller efficiency and minimize the thrust specific fuel consumption. These objectives are usually conflicting with each other. The design variables consist ...

Using a combination of genetic algorithm and particle swarm optimization algorithm for GEMTIP modeling of spectral-induced polarization data

The generalized effective-medium theory of induced polarization (GEMTIP) is a newly developed relaxation model that incorporates the petro-physical and structural characteristics of polarizable rocks in the grain/porous scale to model their complex resistivity/conductivity spectra. The inversion of the GEMTIP relaxation model parameter from spectral-induced polarization data is a challenging is...

Journal title:

Volume 9, Issue

Pages -

Publication date: 2015